Improving Grammaticality in Statistical Sentence Generation: Introducing a Dependency Spanning Tree Algorithm with an Argument Satisfaction Model
نویسندگان
چکیده
Abstract-like text summarisation requires a means of producing novel summary sentences. In order to improve the grammaticality of the generated sentence, we model a global (sentence) level syntactic structure. We couch statistical sentence generation as a spanning tree problem in order to search for the best dependency tree spanning a set of chosen words. We also introduce a new search algorithm for this task that models argument satisfaction to improve the linguistic validity of the generated tree. We treat the allocation of modifiers to heads as a weighted bipartite graph matching (or assignment) problem, a well studied problem in graph theory. Using BLEU to measure performance on a string regeneration task, we found an improvement, illustrating the benefit of the spanning tree approach armed with an argument satisfaction model.like text summarisation requires a means of producing novel summary sentences. In order to improve the grammaticality of the generated sentence, we model a global (sentence) level syntactic structure. We couch statistical sentence generation as a spanning tree problem in order to search for the best dependency tree spanning a set of chosen words. We also introduce a new search algorithm for this task that models argument satisfaction to improve the linguistic validity of the generated tree. We treat the allocation of modifiers to heads as a weighted bipartite graph matching (or assignment) problem, a well studied problem in graph theory. Using BLEU to measure performance on a string regeneration task, we found an improvement, illustrating the benefit of the spanning tree approach armed with an argument satisfaction model.
منابع مشابه
A Generic Sentence Trimmer with CRFs
The paper presents a novel sentence trimmer in Japanese, which combines a non-statistical yet generic tree generation model and Conditional Random Fields (CRFs), to address improving the grammaticality of compression while retaining its relevance. Experiments found that the present approach outperforms in grammaticality and in relevance a dependency-centric approach (Oguro et al., 2000; Morooka...
متن کاملSearching for Grammaticality: Propagating Dependencies in the Viterbi Algorithm
In many text-to-text generation scenarios (for instance, summarisation), we encounter humanauthored sentences that could be composed by recycling portions of related sentences to form new sentences. In this paper, we couch the generation of such sentences as a search problem. We investigate a statistical sentence generation method which recombines words to form new sentences. We propose an exte...
متن کاملTowards Statistical Paraphrase Generation: Preliminary Evaluations of Grammaticality
Summary sentences are often paraphrases of existing sentences. They may be made up of recycled fragments of text taken from important sentences in an input document. We investigate the use of a statistical sentence generation technique that recombines words probabilistically in order to create new sentences. Given a set of event-related sentences, we use an extended version of the Viterbi algor...
متن کاملUnsupervised Sentence Enhancement for Automatic Summarization
We present sentence enhancement as a novel technique for text-to-text generation in abstractive summarization. Compared to extraction or previous approaches to sentence fusion, sentence enhancement increases the range of possible summary sentences by allowing the combination of dependency subtrees from any sentence from the source text. Our experiments indicate that our approach yields summary ...
متن کاملGlobal Revision in Summarisation: Generating Novel Sentences with Prim’s Algorithm
In abstract-like summarisation, extracted sentences containing key content are often revised to improve the coherence of the overall summary. In this work, we consider the task of Global Revision, in which a key sentence is revised and supplemented with additional content from the original document. Specifically, this task comprises two subtasks: selecting content; and grammatically ordering co...
متن کامل